 random forest approach


A U-Statistic-based random forest approach for genetic interaction study

arXiv.org Artificial Intelligence

Variations in complex traits are influenced by multiple genetic variants, environmental risk factors, and their interactions. Though substantial progress has been made in identifying single genetic variants associated with complex traits, detecting gene-gene and gene-environment interactions remains a great challenge. When a large number of genetic variants and environmental risk factors are involved, searching for interactions is limited to pair-wise interactions due to the exponentially increased feature space and computational intensity. Alternatively, recursive partitioning approaches, such as random forests, have gained popularity in high-dimensional genetic association studies. In this article, we propose a U-Statistic-based random forest approach, referred to as Forest U-Test, for genetic association studies with quantitative traits. Through simulation studies, we showed that the Forest U-Test outperformed existing methods. The proposed method was also applied to study Cannabis Dependence (CD), using three independent datasets from the Study of Addiction: Genetics and Environment. A significant joint association was detected with an empirical p-value less than 0.001. The finding was also replicated in two independent datasets with p-values of 5.93e-19 and 4.70e-17, respectively.


Machine Learning for Multi-Output Regression: When should a holistic multivariate approach be preferred over separate univariate ones?

arXiv.org Machine Learning

The hope of such multivariate analyses is that considering possible dependencies between the outcomes may lead to procedures with better power (in the case of inference) or accuracy (in the case of prediction) than separate univariate analyses. While the need for the development and use of valid and distributionally robust or nonparametric multivariate methods has been recognized and addressed in inferential statistics (Dobler et al., 2020; Friedrich et al., 2019; Konietschke et al., 2015; Smaga, 2017; Vallejo and Ato, 2012; Zimmermann et al., 2020), there are no exhaustive studies that exploit the potential of multivariate regression methods for prediction. Focusing on tree-based ensemble methods such as the Random Forest, this manuscript aims to close this gap. In particular, we want to answer our research-motivating question: When should a holistic multivariate regression approach be preferred over separate univariate predictions? Corresponding author email address: lena.schmid@tu-dortmund.de (Lena Schmid)
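To make the contrast between a holistic multivariate approach and separate univariate ones concrete, here is a minimal sketch of a single tree split, the building block of a random forest. It is not the authors' procedure: the function names `sse` and `best_split`, the toy data, and the choice of summed squared error as the joint criterion are all assumptions made for illustration. A univariate split is chosen for one outcome at a time, while a joint split must serve all outcomes at once.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(x, outputs):
    """Return (threshold, cost) for the split on x with the lowest total SSE.

    `outputs` is a list of outcome columns. Passing one column gives a
    univariate split; passing several gives a joint (multivariate) split
    whose cost is the SSE summed over all outcomes (an assumed criterion).
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    best = (None, float("inf"))
    for k in range(1, len(order)):
        lo, hi = x[order[k - 1]], x[order[k]]
        if lo == hi:
            continue  # no threshold fits between equal feature values
        left, right = order[:k], order[k:]
        cost = sum(
            sse([y[i] for i in left]) + sse([y[i] for i in right])
            for y in outputs
        )
        if cost < best[1]:
            best = ((lo + hi) / 2, cost)
    return best


# Hypothetical toy data: the two outcomes change at different points of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y1 = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
y2 = [0.0, 0.0, 0.0, 0.0, 2.0, 2.0]

print("univariate split for y1:", best_split(x, [y1]))
print("univariate split for y2:", best_split(x, [y2]))
print("joint split for y1, y2: ", best_split(x, [y1, y2]))
```

In this toy case each univariate split fits its own outcome exactly, whereas the joint split has to compromise and follows the outcome with the larger scale. Whether sharing splits helps or hurts on real data depends on how the outcomes are related, which is the question the manuscript sets out to study.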


Can Machine Learning Improve Recession Prediction?

#artificialintelligence

They can only give you answers." Big data utilization in economics and the financial world has increased with each passing day. In previous reports, we have discussed issues and opportunities related to big data applications in economics/finance. This piece is a quick summary of a more-detailed report that outlines a framework to utilize machine learning and statistical data mining tools in the economics/financial world with the goal of more accurately predicting recessions. Decision makers have a vital interest in predicting future recessions in order to enact appropriate policy.


Can Machine Learning Improve Recession Prediction?

#artificialintelligence

Big data utilization in economics and the financial world has increased with every passing day. In previous reports, we have discussed issues and opportunities related to big data applications in economics/finance. This report outlines a framework to utilize machine learning and statistical data mining tools in the economics/financial world with the goal of more accurately predicting recessions. Decision makers have a vital interest in predicting future recessions in order to enact appropriate policy. Therefore, to help decision makers, we raise the question: Do machine learning and statistical data mining improve recession prediction accuracy?


Learning from Disaster – The Random Forest Approach.

#artificialintelligence

Having tried logistic regression the first time around, I moved on to decision trees and KNN. But unfortunately, those models performed horribly and had to be scrapped. Random Forest seemed to be the buzzword around the Kaggle forums, so I obviously had to try it out next. I took a couple of days to read up on it and worked out a few examples on my own before taking another stab at the Titanic dataset. The 'caret' package is a beauty.